- Meta just launched Movie Gen, an AI video generator to compete with OpenAI's Sora.
- Movie Gen can make videos with accompanying audio using a text prompt. It can also edit by prompt.
- Meta joined the video generation race later than OpenAI and Google.
Meta released a new AI video-generating tool on Friday that is also the company's latest volley in its battle with OpenAI for AI supremacy.
"Today, we're excited to premiere Meta Movie Gen, our breakthrough generative AI research for media, which includes modalities like image, video, and audio," the company said in a press release. "Movie Gen outperforms similar models in the industry across these tasks when evaluated by humans."
In its press release, Meta called Movie Gen the "most advanced and immersive storytelling suite of models," including video generation, audio generation, personalized video generation, and video editing. The models were trained using publicly available data and licensed data, the company said.
With a text prompt, Movie Gen can create videos up to 16 seconds long at 16 frames a second while reasoning "about object motion, subject-object interactions, and camera motion." Users can upload a picture of themselves to be incorporated into personalized videos, and Movie Gen can edit videos with text instructions from the user.
Meta's example video shows an underwater perspective of a baby hippo (Moo Deng reference, anyone?) happily swimming around in a serene aquatic scene.
Another shows a koala on a surfboard and the accompanying prompt: "A fluffy koala bear surfs. It has a gray and white coat and a round nose. The surfboard is yellow. The koala bear is holding onto the surfboard with its paws. The koala bear's facial expression is focused. The sun is shining."
With audio generation, users can "create and extend sound effects, background music or entire soundtracks" up to 45 seconds long, the press release said. An example clip of a snake slithering through a wooded area includes the prompt: "Rustling leaves and snapping twigs, with an orchestral music track."
Meta is a little late to the audio- and video-generation game as top competitors like OpenAI and Google have already claimed a foothold in the space. OpenAi launched Sora, its video generator, in February, and Google followed suit with Veo in May.
Meta, however, has given OpenAI a run for its money in the AI arms race. Though OpenAI's ChatGPT debuted first and launched the company to worldwide fame, recent iterations of Meta's Llama model have been well received. Many viewed Llama 3.1, which came out in July, as superior to OpenAI's GPT-4o, which came out shortly before.
Meta says its new "state-of-the-art models" outperform competitors in A/B human comparisons. For video generation, those Meta surveyed preferred Movie Gen over OpenAI Sora, the company's press release said. Meta didn't share an A/B comparison with Google's Veo, which also offers sound effects and music, but Meta said in a lengthy accompanying research paper that it believes Google's video-to-audio generation models may be more limited in length than Meta's.
Meta, OpenAI, and Google did not immediately respond to a request for comment.